Sains Malaysiana 55(3)(2026): 491-501
http://doi.org/10.17576/jsm-2026-5503-11
A Mini-Batch Algorithm with Adaptive Learning Rate Strategy
(Algoritma Kelompok Mini dengan Strategi Kadar Pembelajaran Adaptif)
WEIJUAN SHI1, ADIBAH SHUIB2,* & ZURAIDA ALWADOOD2
1College of Mathematics and Finance, Hunan University of Humanities, Science and Technology, Loudi, China
2Faculty of Computer and Mathematical Sciences (FSKM), Universiti Teknologi MARA (UiTM), 40450 Shah Alam, Selangor, Malaysia
Received: 31 December 2024 / Accepted: 19 February 2026
Abstract
To address the limitations of manually selected step sizes and diminishing step-size sequences, which can slow convergence in mini-batch algorithms, we propose a strategy for computing step sizes automatically using the Positive Defined Stabilized Barzilai-Borwein (PDSBB) method. The PDSBB step size is integrated into the mini-batch semi-stochastic gradient descent (mS2GD) algorithm, yielding a new algorithm called mS2GD-PDSBB. Building on the linear convergence result, the computational complexity is characterized in terms of the expected number of stochastic gradient evaluations required to reach a prescribed accuracy. Computational experiments on benchmark instances are conducted to evaluate the convergence behavior of the proposed algorithm. With a suitable mini-batch size, mS2GD-PDSBB attains performance consistent with the base algorithms. The numerical experiments demonstrate that the proposed algorithm achieves stable and fast convergence under the adaptive step-size strategy. In particular, it is less sensitive to the choice of initial step size and consistently matches or outperforms mS2GD and mS2GD-BB in terms of objective sub-optimality and test error across different datasets.
Keywords: Adaptive step size; convergence rate; mS2GD algorithm; PDSBB method
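To make the step-size family named in the abstract concrete, the display below recalls the two classical Barzilai-Borwein (BB) step sizes on which PDSBB is built. The PDSBB form in the second display is an illustrative sketch on our part, not the paper's exact rule: positivity is assumed to be enforced through an absolute value in the denominator, and stabilization through a gradient-norm cap with parameter \delta.

\[
\eta_k^{\mathrm{BB1}} = \frac{s_{k-1}^{\top} s_{k-1}}{s_{k-1}^{\top} y_{k-1}}, \qquad
\eta_k^{\mathrm{BB2}} = \frac{s_{k-1}^{\top} y_{k-1}}{y_{k-1}^{\top} y_{k-1}}, \qquad
s_{k-1} = w_k - w_{k-1}, \;\; y_{k-1} = \nabla f(w_k) - \nabla f(w_{k-1}),
\]
\[
\eta_k^{\mathrm{PDSBB}} = \min\!\left\{ \frac{s_{k-1}^{\top} s_{k-1}}{\left| s_{k-1}^{\top} y_{k-1} \right|}, \; \frac{\delta}{\lVert \nabla f(w_k) \rVert} \right\} \quad \text{(illustrative form)}.
\]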
Abstrak
Untuk menangani batasan pemilihan saiz langkah secara manual atau penggunaan jujukan saiz langkah yang semakin berkurangan, yang boleh memperlahankan penumpuan dalam algoritma kelompok mini, kami mencadangkan strategi untuk mengira saiz langkah secara automatik dengan menggunakan kaedah Positive Defined Stabilized Barzilai-Borwein (PDSBB). Saiz langkah PDSBB disepadukan ke dalam algoritma penurunan kecerunan separa stokastik kelompok mini (mS2GD), mewujudkan algoritma baharu yang dipanggil mS2GD-PDSBB. Berdasarkan hasil penumpuan linear, kerumitan pengiraan dicirikan dari segi bilangan penilaian kecerunan stokastik yang dijangka yang diperlukan untuk mencapai tahap ketepatan yang ditetapkan. Uji kaji pengiraan menggunakan contoh penanda aras dijalankan untuk menilai tingkah laku penumpuan algoritma yang dicadangkan. Saiz kelompok mini yang sesuai membolehkan algoritma mS2GD-PDSBB mencapai prestasi yang konsisten dengan algoritma asas. Uji kaji berangka menunjukkan bahawa algoritma mS2GD-PDSBB yang dicadangkan mencapai penumpuan yang stabil dan pantas menerusi strategi saiz langkah adaptif. Secara khususnya, algoritma ini menunjukkan sensitiviti yang berkurangan terhadap pilihan saiz langkah awal dan secara konsisten mengatasi atau menyamai mS2GD dan mS2GD-BB dari segi sub-keoptimuman objektif dan ralat ujian merentasi set data yang berbeza.
Kata kunci: Algoritma mS2GD; kaedah PDSBB; kadar penumpuan; saiz langkah adaptif
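As a companion to the abstract, the following Python sketch shows how a BB-type adaptive step size can be computed once per outer iteration from snapshot differences and reused inside an mS2GD-style variance-reduced inner loop. It covers only the smooth, unregularized setting; the PDSBB rule used here (absolute-value denominator plus a stabilization cap delta) is our assumption, and the function name ms2gd_pdsbb with all its parameter defaults is illustrative, not the authors' implementation.

import numpy as np

def ms2gd_pdsbb(grad_i, n, w0, outer=15, m=None, b=8, eta0=0.1, delta=1.0, seed=0):
    # Illustrative mS2GD-style loop with a hypothetical PDSBB step size.
    # grad_i(w, i): gradient of the i-th component function at w;
    # n: number of components. All defaults are assumptions for the sketch.
    rng = np.random.default_rng(seed)
    m = 2 * n if m is None else m                      # maximum inner-loop length
    w = np.asarray(w0, dtype=float)
    eta, w_prev, mu_prev = eta0, None, None
    for _ in range(outer):
        mu = np.mean([grad_i(w, i) for i in range(n)], axis=0)  # full gradient at snapshot
        if w_prev is not None:                         # BB-type step from snapshot differences
            s, y = w - w_prev, mu - mu_prev
            bb = s.dot(s) / max(abs(s.dot(y)), 1e-12)  # |.| keeps the step positive (assumed)
            eta = min(bb / m, delta / max(np.linalg.norm(mu), 1e-12))  # stabilization cap (assumed)
        w_prev, mu_prev = w.copy(), mu.copy()
        x = w.copy()
        for _ in range(rng.integers(1, m + 1)):        # random inner-loop length, as in mS2GD
            batch = rng.choice(n, size=min(b, n), replace=False)
            v = mu + np.mean([grad_i(x, i) - grad_i(w, i) for i in batch], axis=0)
            x -= eta * v                               # variance-reduced mini-batch step
        w = x
    return w

# Toy usage on least-squares components f_i(w) = 0.5 * (a_i . w - t_i)^2.
rng = np.random.default_rng(1)
A, t = rng.normal(size=(100, 5)), rng.normal(size=100)
g = lambda w, i: (A[i].dot(w) - t[i]) * A[i]
print(ms2gd_pdsbb(g, 100, np.zeros(5)))

Computing the step size only at snapshots keeps the per-iteration cost of the inner loop unchanged, which is presumably why BB-type rules pair naturally with variance-reduced methods.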
*Corresponding author; email: adibah253@uitm.edu.my